*Target (protein/gene name): Protein Tyrosine Phosphatase
*NCBI Gene # or RefSeq#: Not found (hard to locate when searching backwards from the protein ID)
*Protein ID (NP or XP #) or Wolbachia#: XM_001914124.1

*Organism (including strain): Entamoeba histolytica HM-1:IMSS
Etiologic Risk Group (see link below): Appendix B-II-C. Risk Group 2 (RG2)-Parasitic Agents
*Background/Disease Information (sort of like the Intro to your Mini Research Write up):
Entamoeba histolytica is an anaerobic parasitic protozoan that causes amoebiasis, including amoebic colitis; amoebiasis is the second leading cause of death from a parasitic disease in the world. The protozoan is transmitted in fecal matter through direct contact or through contaminated food or water. When the organism reaches the intestines, it secretes the phosphatase, which hydrolyzes phosphate esters such as p-nitrophenyl phosphate (pNPP, the substrate used in the enzyme assay) at acidic pH. The secreted enzyme is thought to kill host cells on contact, leading to tissue death. The organism can also travel to the liver and cause abscesses there. Symptoms include diarrhea or dysentery, fatigue, weight loss, fever, and abdominal pain. Some antibiotics can be used to treat a liver abscess, but there is no vaccine for amoebiasis. Prevention relies on food, water, and personal hygiene. The disease occurs mostly in areas of the world with poor sanitation.
https://www.neb.com/tools-and-resources/feature-articles/parasitic-infections-in-humans
http://www.sciencedirect.com/science/article/pii/S002075190800146X
http://iai.asm.org/content/70/4/1816.short
http://www.sciencedirect.com/science/article/pii/S0020751903000298

Essentiality of this protein: The protein is thought to be a key factor in causing host cell death.
Complex of proteins?: Just secreted phosphotyrosine phosphatase
Druggable Target: Research is currently validating the essentiality of the protein, and the known inhibitors listed below suggest that it can potentially be inhibited.

*EC#: 3.1.3.48
Link to BRENDA EC# page: http://www.brenda-enzymes.org/php/result_flat.php4?ecno=3.1.3.48
PTPMechanism.jpg
Figure 1: Mechanism schematic for protein tyrosine phosphatase.

Enzyme Assay information (spectrophotometric, coupled assay ?, reagents): Previous research measured activity spectrophotometrically at 405 nm. Reagents used were pNPP, NaOH, and O-phospho-L-tyrosine (P-Tyr).
-- link to Sigma (or other company) page for assay or assay reagents (substrates):
http://www.sigmaaldrich.com/catalog/search?interface=All&term=p-nitrophenyl%20phosphate&lang=en&region=US&focus=product&N=0+220003048+219853269+219853286
-- link (or citation) to paper that contains assay information: http://www.sciencedirect.com/science/article/pii/S0020751903000298
-- List cost and quantity of substrate reagents and supplier
- pNPP assay: $314, Sigma, 1000 well assays or 100 tube assays
- O-phospho-L-tyrosine (P-Tyr): $76.20, Sigma, 250 mg
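
To make the 405 nm pNPP readout concrete, here is a minimal Python sketch that converts a blank-corrected absorbance into nmol of p-nitrophenol released using the Beer-Lambert law. The extinction coefficient (18,000 M^-1 cm^-1 for p-nitrophenolate in alkaline solution) is a commonly quoted value and an assumption here, not a number taken from the cited paper; the function names and the example readings are hypothetical.

```python
# Assumed molar extinction coefficient of p-nitrophenolate at 405 nm in
# alkaline solution (M^-1 cm^-1). Commonly quoted value; not from the cited paper.
EPSILON_PNP_405 = 18000.0


def pnp_released_nmol(a405, blank_a405, volume_ml, path_cm=1.0,
                      epsilon=EPSILON_PNP_405):
    """Convert a blank-corrected A405 reading to nmol of p-nitrophenol."""
    # Beer-Lambert law: concentration (mol/L) = A / (epsilon * path length)
    conc_molar = (a405 - blank_a405) / (epsilon * path_cm)
    # mol/L * L gives mol; multiply by 1e9 to express the result in nmol
    return conc_molar * (volume_ml / 1000.0) * 1e9


def specific_activity(nmol, minutes, mg_protein):
    """Activity in nmol of product per minute per mg of protein."""
    return nmol / (minutes * mg_protein)


# Hypothetical reading: A405 = 0.46 against a 0.10 blank in a 1 ml cuvette,
# after a 30 minute incubation with 0.01 mg of enzyme.
released = pnp_released_nmol(0.46, 0.10, volume_ml=1.0)
print(round(released, 2))                               # 20.0
print(round(specific_activity(released, 30, 0.01), 2))  # 66.67
```

The path length and reaction volume are left as parameters because a plate reader well and a cuvette differ in both.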

Structure Available (PDB or Homology model)

*Solved structures available: 3IDO, 3ILY, 3JS5, 3JVI

-- PDB # or closest PDB entry if using homology model: 3JS5
-- For Homology Model option:
---- Show pairwise alignment of your BLASTP search in NCBI against the PDB
BLAST.jpg
Figure 2: Side-by-side BLASTP comparison of 3JS5 and human phosphotyrosyl phosphatase.

---- Query Coverage: 76%
---- Max % Identities: 36%
---- % Positives: 52%
---- Chain used for homology: Chain A, Crystal Structure of A Human Low Molecular Weight Phosphotyrosyl Phosphatase. Implications for Substrate Specificity.

Current Inhibitors: Ammonium molybdate, sodium tungstate, sodium orthovanadate
Expression Information (has it been expressed in bacterial cells): It has been expressed in humans, rats, cattle, and yeast. E. histolytica cells have been cultured in TYI-S-33 medium.
Purification Method: E. histolytica was grown in medium, then the excretory phosphatase was separated from the supernatant. The supernatants were concentrated at 4 °C by pressure ultrafiltration (Amicon PM 30 membrane) in a stirred cell apparatus. The concentrate was then diluted with an equilibration buffer and passed through a purification column (Con A-Sepharose). The acid phosphatase was then eluted with a buffer containing 0.5 M α-methyl-D-mannoside. Samples that showed enzyme activity were passed through a second column (DEAE-Cellulose), washed with an equilibration buffer, and eluted with a sodium chloride gradient.
http://www.sciencedirect.com/science/article/pii/S0020751903000298
Image of protein (PyMol with features delineated and shown separately):
3js5_bio_r_500.jpg
Figure 3: Crystal structure of protein tyrosine phosphatase from Entamoeba histolytica with HEPES in the active site.
http://www.rcsb.org/pdb/images/3js5_bio_r_500.jpg

*Amino Acid Sequence (paste as text only - not as screenshot or as 'code'):
mahhhhhhmgtleaqtqgpgsmkllfvclgnicrspaaeavmkkviqnhhltekyicdsagtcsyhegqqadsrmrkvgksrgyqvdsisrpvvss
dfknfdyifamdndnyyelldrcpeqykqkifkmvdfcttikttevpdpyyggekgfhrvidiledacenliikleegklin
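*length of your protein in Amino Acids: 178 (full sequence above, including the N-terminal His tag); 157 (shortened sequence without the tag)
Molecular Weight of your protein in kiloDaltons using the Expasy ProtParam website: 20.35 kDa (20351.1 Da) for the full sequence; 18.07 kDa (18067.7 Da) for the shortened sequence
Molar Extinction coefficient of your protein at 280 nm wavelength: 13785 M-1 cm-1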
Of shortened sequence:
Ext. coefficient 13785; Abs 0.1% (=1 g/l) 0.763, assuming all pairs of Cys residues form cystines
Ext. coefficient 13410; Abs 0.1% (=1 g/l) 0.742, assuming all Cys residues are reduced
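
The extinction coefficients above can be reproduced from the residue counts. The sketch below assumes the per-residue values ProtParam uses at 280 nm (Trp 5500, Tyr 1490, cystine 125 M^-1 cm^-1) and takes the shortened sequence to be residues 22 to 178 of the sequence pasted above; the variable and function names are illustrative only, and the molecular weight is the ProtParam value quoted on this page rather than something the code computes.

```python
# Shortened sequence: the pasted amino acid sequence without the first 21 residues.
SHORT_SEQ = (
    "mkllfvclgnicrspaaeavmkkviqnhhltekyicdsagtcsyhegqqadsrmrkvgksrgyqvdsisrpvvss"
    "dfknfdyifamdndnyyelldrcpeqykqkifkmvdfcttikttevpdpyyggekgfhrvidiledacenliikleegklin"
).upper()

# Per-residue contributions at 280 nm (M^-1 cm^-1), assumed to match ProtParam.
EXT_TRP, EXT_TYR, EXT_CYSTINE = 5500, 1490, 125

MW_SHORT = 18067.7  # Da, ProtParam value quoted above for the shortened sequence


def extinction_coefficient(seq, cystines=True):
    """Estimate the molar extinction coefficient at 280 nm from composition."""
    coeff = seq.count("W") * EXT_TRP + seq.count("Y") * EXT_TYR
    if cystines:
        # Each pair of Cys residues is assumed to form one cystine.
        coeff += (seq.count("C") // 2) * EXT_CYSTINE
    return coeff


print(len(SHORT_SEQ))                                                   # 157
print(extinction_coefficient(SHORT_SEQ, cystines=True))                 # 13785
print(extinction_coefficient(SHORT_SEQ, cystines=False))                # 13410
# Abs 0.1% (1 g/l) is the extinction coefficient divided by the molecular weight.
print(round(extinction_coefficient(SHORT_SEQ, True) / MW_SHORT, 3))     # 0.763
print(round(extinction_coefficient(SHORT_SEQ, False) / MW_SHORT, 3))    # 0.742
```

The sequence has nine Tyr, seven Cys, and no Trp, so the 280 nm absorbance comes almost entirely from tyrosine.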

TMpred graph Image (http://www.ch.embnet.org/software/TMPRED_form.html). Input your amino acid sequence to it.
Graph.jpg
*CDS Gene Sequence (paste as text only):
ATGAAACTGCTGTTCGTTTGCCTGGGTAACATCTGCCGTTCTCCGGCTGCGGAAGCGGTT
ATGAAAAAAGTTATCCAGAACCACCACCTGACCGAAAAATACATCTGTGACTCTGCGGGT
ACCTGCTCTTACCACGAAGGTCAGCAGGCGGACTCTCGTATGCGTAAAGTTGGTAAATCT
CGTGGTTACCAGGTTGACTCTATCTCTCGTCCGGTTGTTTCTTCTGACTTCAAGAACTTT
GACTACATCTTCGCGATGGACAACGACAACTACTACGAACTCCTGGACCGTTGCCCGGAA
CAGTACAAACAGAAAATCTTCAAAATGGTAGACTTCTGCACCACCATCAAAACCACCGAA
GTTCCGGACCCGTACTACGGTGGTGAAAAAGGTTTCCACCGTGTTATCGACATCCTGGAG
GACGCGTGCGAAAACCTGATCATCAAACTGGAAGAAGGTAAACTGATCAACTAA
*GC% Content for gene: 47.89%
*CDS Gene Sequence (codon optimized) - copy from output of Primer Design Protocol (paste as text only):
NOTE - this seems to have the 'tail' on the beginning and end for cloning. So, start at ATG and end at TAA
TACTTCCAATCCATGAAACTGCTGTTCGTTTGCCTGGGTAACATCTGCCGTTCTCCGGCTGCGGAAGCGGTT
ATGAAAAAAGTTATCCAGAACCACCACCTGACCGAAAAATACATCTGTGACTCTGCGGGT
ACCTGCTCTTACCACGAAGGTCAGCAGGCGGACTCTCGTATGCGTAAAGTTGGTAAATCT
CGTGGTTACCAGGTTGACTCTATCTCTCGTCCGGTTGTTTCTTCTGACTTCAAGAACTTT
GACTACATCTTCGCGATGGACAACGACAACTACTACGAACTCCTGGACCGTTGCCCGGAA
CAGTACAAACAGAAAATCTTCAAAATGGTAGACTTCTGCACCACCATCAAAACCACCGAA
GTTCCGGACCCGTACTACGGTGGTGAAAAAGGTTTCCACCGTGTTATCGACATCCTGGAG
GACGCGTGCGAAAACCTGATCATCAAACTGGAAGAAGGTAAACTGATCAACTAACAGTAAAGGTGGATA
*GC% Content for gene (codon optimized): 47.5% (calculated over the sequence above including the cloning tails; 47.89% from ATG to TAA)
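
Both GC figures above can be checked with a few lines of Python. This is a minimal sketch: the helper name gc_percent is illustrative, and the two tails are simply the bases that flank ATG...TAA in the tailed sequence pasted above.

```python
# CDS from ATG to TAA, as pasted above (474 nt).
CDS = (
    "ATGAAACTGCTGTTCGTTTGCCTGGGTAACATCTGCCGTTCTCCGGCTGCGGAAGCGGTT"
    "ATGAAAAAAGTTATCCAGAACCACCACCTGACCGAAAAATACATCTGTGACTCTGCGGGT"
    "ACCTGCTCTTACCACGAAGGTCAGCAGGCGGACTCTCGTATGCGTAAAGTTGGTAAATCT"
    "CGTGGTTACCAGGTTGACTCTATCTCTCGTCCGGTTGTTTCTTCTGACTTCAAGAACTTT"
    "GACTACATCTTCGCGATGGACAACGACAACTACTACGAACTCCTGGACCGTTGCCCGGAA"
    "CAGTACAAACAGAAAATCTTCAAAATGGTAGACTTCTGCACCACCATCAAAACCACCGAA"
    "GTTCCGGACCCGTACTACGGTGGTGAAAAAGGTTTCCACCGTGTTATCGACATCCTGGAG"
    "GACGCGTGCGAAAACCTGATCATCAAACTGGAAGAAGGTAAACTGATCAACTAA"
)

# Cloning tails that flank the CDS in the tailed sequence above.
FWD_TAIL = "TACTTCCAATCC"
REV_TAIL = "CAGTAAAGGTGGATA"


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


print(len(CDS))                                          # 474
print(round(gc_percent(CDS), 2))                         # 47.89
print(round(gc_percent(FWD_TAIL + CDS + REV_TAIL), 1))   # 47.5
```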

CODON OPTIMIZED SEQUENCE from
DNA Works Output File: AF_EhistolyticaPTP_61314
 The DNA sequence #   1 is:
 ----------------------------------------------------------------
   1 ATGAAACTGCTGTTCGTTTGCCTGGGTAACATCTGCCGTTCTCCGGCTGCGGAAGCGGTT
  61 ATGAAAAAAGTTATCCAGAACCACCACCTGACCGAAAAATACATCTGTGACTCTGCGGGT
 121 ACCTGCTCTTACCACGAAGGTCAGCAGGCGGACTCTCGTATGCGTAAAGTTGGTAAATCT
 181 CGTGGTTACCAGGTTGACTCTATCTCTCGTCCGGTTGTTTCTTCTGACTTCAAGAACTTT
 241 GACTACATCTTCGCGATGGACAACGACAACTACTACGAACTCCTGGACCGTTGCCCGGAA
 301 CAGTACAAACAGAAAATCTTCAAAATGGTAGACTTCTGCACCACCATCAAAACCACCGAA
 361 GTTCCGGACCCGTACTACGGTGGTGAAAAAGGTTTCCACCGTGTTATCGACATCCTGGAG
 421 GACGCGTGCGAAAACCTGATCATCAAACTGGAAGAAGGTAAACTGATCAACTAA
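
As a sanity check on the DNA Works output, the sequence can be translated and compared with the protein sequence. The sketch below builds the standard genetic code table and translates the 474 nt sequence; the names CODON_TABLE and translate are illustrative, and the expected protein is taken to be residues 22 to 178 of the amino acid sequence pasted earlier on this page.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES),
    AMINO_ACIDS,
))

# DNA Works sequence #1, with position numbers and whitespace removed.
CDS = (
    "ATGAAACTGCTGTTCGTTTGCCTGGGTAACATCTGCCGTTCTCCGGCTGCGGAAGCGGTT"
    "ATGAAAAAAGTTATCCAGAACCACCACCTGACCGAAAAATACATCTGTGACTCTGCGGGT"
    "ACCTGCTCTTACCACGAAGGTCAGCAGGCGGACTCTCGTATGCGTAAAGTTGGTAAATCT"
    "CGTGGTTACCAGGTTGACTCTATCTCTCGTCCGGTTGTTTCTTCTGACTTCAAGAACTTT"
    "GACTACATCTTCGCGATGGACAACGACAACTACTACGAACTCCTGGACCGTTGCCCGGAA"
    "CAGTACAAACAGAAAATCTTCAAAATGGTAGACTTCTGCACCACCATCAAAACCACCGAA"
    "GTTCCGGACCCGTACTACGGTGGTGAAAAAGGTTTCCACCGTGTTATCGACATCCTGGAG"
    "GACGCGTGCGAAAACCTGATCATCAAACTGGAAGAAGGTAAACTGATCAACTAA"
)

# Expected protein: the pasted amino acid sequence without its first 21 residues.
EXPECTED = (
    "mkllfvclgnicrspaaeavmkkviqnhhltekyicdsagtcsyhegqqadsrmrkvgksrgyqvdsisrpvvss"
    "dfknfdyifamdndnyyelldrcpeqykqkifkmvdfcttikttevpdpyyggekgfhrvidiledacenliikleegklin"
).upper()


def translate(dna):
    """Translate a DNA sequence codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


protein = translate(CDS)
print(protein[:7])                        # MKLLFVC
print(protein.rstrip("*") == EXPECTED)    # True
```

The translation ends in a single stop codon (TAA), giving 157 residues.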

Primer design results for pNIC-Bsa4 cloning (list sequences of all of your ~40 nt long primers):
aef_EhPTP_oligoprimerorder_plateB_070914.PNG


Primer design results for 'tail' primers (this is just 2 sequences):
Forward Primer:
TACTTCCAATCCATGAAACTGCTGTTCG (28 bp)

Reverse Primer:
GGTAAACTGATCAACTAACAGTAAAGGTGGATA (listed as 36 bp; the sequence as shown here is 33 nt)
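
The reverse primer above is written in the same orientation as the 3' end of the tailed sequence. If it is meant to anneal to the sense strand, the oligo would be its reverse complement. The Python sketch below shows that conversion and checks the two tails; treating the listed reverse primer as sense-strand text is an assumption, and the function name is illustrative.

```python
FORWARD_PRIMER = "TACTTCCAATCCATGAAACTGCTGTTCG"
REVERSE_AS_LISTED = "GGTAAACTGATCAACTAACAGTAAAGGTGGATA"

# Translation table that maps each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


print(len(FORWARD_PRIMER))                       # 28
print(len(REVERSE_AS_LISTED))                    # 33
# The forward primer starts with the 5' tail, followed by the start of the CDS.
print(FORWARD_PRIMER.startswith("TACTTCCAATCC" + "ATG"))   # True
# Reverse complement of the listed reverse primer (assumed sense orientation).
print(reverse_complement(REVERSE_AS_LISTED))
# TATCCACCTTTACTGTTAGTTGATCAGTTTACC
```

In this orientation the reverse oligo begins with TATCCACCTTTACTG, the reverse complement of the 3' tail, followed by the complement of the stop codon and the end of the gene.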



Resources:
See ProtocolTargetDiscoveryVDS.docx for more
Etiologic Risk Group Categories (for pathogens): http://www.utexas.edu/research/rsc/ibc/agent_class.html#_Toc7238334
Scientific Journals:
http://www.sciencedirect.com/science/article/pii/S0020751903000298
http://www.ncbi.nlm.nih.gov/pubmed?cmd=Search&term=12814646
Databases of genes/organisms:
http://www.niaid.nih.gov/Pages/default.aspx
http://eupathdb.org/eupathdb/
https://patricbrc.vbi.vt.edu/portal/portal/patric/Home
http://www.nmpdr.org/FIG/wiki/view.cgi/Main/EssentialGenes
http://tubic.tju.edu.cn/deg/
http://csgid.org/csgid/cake/pages/community_request_gateway
http://tdrtargets.org/
http://gsc.jcvi.org/status.shtml
Scientific Nomenclature page from the Centers for Disease Control and Prevention (gene, protein names and abbreviations)
http://wwwnc.cdc.gov/eid/pages/scientific-nomenclature.htm
Gene Information:
Genome: http://www.ncbi.nlm.nih.gov/nuccore/183235131?report=fasta
NCBI GENE Page: http://www.ncbi.nlm.nih.gov/gene
BLAST Page: http://blast.ncbi.nlm.nih.gov/
Protein Information:
Protein Sequence: http://www.ncbi.nlm.nih.gov/protein/472460682?report=fasta
NCBI Protein Page: http://www.ncbi.nlm.nih.gov/protein
Protein Expression Website
Protein Expression Paper: SGC_ProteinProductionPurificationNatMethods2008.pdf
Primer Overlap PCR Articles
HooverLubkowski_PCRoverlapcloninggnf042.pdf
StemmerPCRoverlapGene1995.pdf
Is my target good for Virtual Screening programs?
Reynolds_THermodynamicsLigandBinding_MedChemLett2011.pdf